2,427 research outputs found

Implicit feature detection for sentiment analysis

Author: Frasincar F. (Flavius)
Schouten K.I.M. (Kim)
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2014

Implicit feature detection is a promising research direction that has not seen much research yet. Based on previous work, where co-occurrences between notional words and explicit features are used to find implicit features, this research critically reviews its underlying assumptions and proposes a revised algorithm that directly uses the co-occurrences between implicit features and notional words. The revision is shown to perform better than the original method, but both methods are shown to fail in a more realistic scenario.
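
The co-occurrence idea in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the scoring rule (average of co-occurrence count divided by word frequency), the threshold, and the names `train` and `predict` are all assumptions made for illustration, and every word is treated as a notional word.

```python
from collections import Counter, defaultdict


def train(sentences):
    """Count word frequencies and word/implicit-feature co-occurrences.

    `sentences` is an iterable of (words, implicit_feature) pairs,
    a hypothetical annotated training format.
    """
    word_counts = Counter()
    cooc = defaultdict(Counter)
    for words, feature in sentences:
        for word in set(words):
            word_counts[word] += 1
            cooc[feature][word] += 1
    return word_counts, cooc


def predict(words, word_counts, cooc, threshold=0.0):
    """Return the best-scoring implicit feature, or None if no score exceeds the threshold."""
    unique = set(words)
    if not unique:
        return None
    best_feature, best_score = None, threshold
    for feature, counts in cooc.items():
        # Assumed scoring rule: mean relative co-occurrence over the sentence's words.
        score = sum(
            counts[word] / word_counts[word]
            for word in unique
            if word in word_counts
        ) / len(unique)
        if score > best_score:
            best_feature, best_score = feature, score
    return best_feature
```

Raising `threshold` makes the sketch abstain more often, which matters in a setting where many sentences contain no implicit feature at all.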

EUR Research Repository

Erasmus University Digital Repository

Dynamics and tipping point of issue attention in newspapers: quantitative and qualitative content analysis at sentence level in a longitudinal study using supervised machine learning and big data

Author: Opperhuizen A.E. (Alette)
Schouten K.I.M. (Kim)
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 20/05/2020

This study aims to provide a more sensitive understanding of the dynamics and tipping points of issue attention in news media by combining the strengths of quantitative and qualitative research. The topic of this 25-year longitudinal study is the volume and the content of newspaper articles about the emerging risk of gas drilling in The Netherlands. We applied supervised machine learning (SML) because this allowed us to study changes in the quantitative use of subtopics at the detailed sentence level in a large number of articles. The study shows that the actual risk of drilling-induced seismicity gradually increased and that the volume of newspaper attention for the issue also gradually increased for two decades. The sub-topics extracted from media articles during the low media attention period, covering factual information, can b

Erasmus University Digital Repository

Senior Recital

Author: Hammer Shaina
Kasraie Samira
Kim Hye-Young
Schouten Rachelle
Publication venue: Chapman University Digital Commons
Publication date: 06/03/2015

Chapman University Digital Commons

Semantics-Driven Aspect-Based Sentiment Analysis

Author: Schouten K.I.M. (Kim)
Publication venue
Publication date: 16/11/2018

People using the Web are constantly invited to share their opinions and preferences with the rest of the world, which has led to an explosion of opinionated blogs, reviews of products and services, and comments on virtually everything. This type of web-based content is increasingly recognized as a source of data that has added value for multiple application domains. While the large number of available reviews almost ensures that all relevant parts of the entity under review are properly covered, manually reading each and every review is not feasible. Aspect-based sentiment analysis aims to solve this issue, as it is concerned with the development of algorithms that can automatically extract fine-grained sentiment information from a set of reviews, computing a separate sentiment value for the various aspects of the product or service being reviewed. This dissertation focuses on which discriminants are useful when performing aspect-based sentiment analysis. What signals for sentiment can be extracted from the text itself and what is the effect of using extra-textual discriminants? We find that using semantic lexicons or ontologies can greatly improve the quality of aspect-based sentiment analysis, especially with limited training data. Additionally, due to semantics driving the analysis, the algorithm is less of a black box and results are easier to explain.

EUR Research Repository

Erasmus University Digital Repository

Framing a Conflict! How Media Report on Earthquake Risks Caused by Gas Drilling: A Longitudinal Analysis Using Machine Learning Techniques of Media Reporting on Gas Drilling from 1990 to 2015

Author: Klijn E-H. (Erik-Hans)
Opperhuizen A.E. (Alette)
Schouten K.I.M. (Kim)
Publication venue: 'Informa UK Limited'
Publication date: 12/01/2018

Using a new analytical tool, supervised machine learning (SML), a large number of newspaper articles is analysed to answer the question how newspapers frame the news of public risks, in this case of ea

Erasmus University Digital Repository

A Dependency Graph Isomorphism for News Sentence Searching

Author: Flavius Frasincar
Kim Schouten
Publication venue
Publication date: 05/03/2020

Given that the amount of news being published is only increasing, an effective search tool is invaluable to many Web-based companies. With word-based approaches ignoring much of the information in texts, we propose Destiny, a linguistic approach that leverages the syntactic information in sentences by representing sentences as graphs with disambiguated words as nodes and grammatical relations as edges. Destiny performs approximate sub-graph isomorphism on the query graph and the news sentence graphs, exploiting word synonymy as well as hypernymy. Employing a custom corpus of user-rated queries and sentences, the algorithm is evaluated using the normalized Discounted Cumulative Gain, Spearman's Rho, and Mean Average Precision, and it is shown that Destiny performs significantly better than a TF-IDF baseline on the considered measures and corpus.
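
One of the evaluation measures named in this abstract, the normalized Discounted Cumulative Gain, can be sketched as follows. This is a common textbook formulation with linear gain and a log2 rank discount; the abstract does not say which variant the authors used, so the exact form and the function names `dcg` and `ndcg` are assumptions.

```python
import math


def dcg(relevances):
    """Discounted Cumulative Gain for relevance scores listed in ranked order."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances, start=1)
    )


def ndcg(relevances):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0
```

A ranking that already lists the user-rated sentences in descending order of relevance scores 1.0; any other ordering of the same ratings scores lower.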

CiteSeerX

Constraints on the sources of branched tetraether membrane lipids in distal marine sediments

Author: Kim J.-H.
Schefuß E.
Schouten S.
Sinninghe Damsté J.
Weijers J.W.H.
Publication venue
Publication date: 01/01/2014

Branched glycerol dialkyl glycerol tetraethers (brGDGTs) are membrane lipids produced by soil bacteria and occur in near coastal marine sediments as a result of soil organic matter input. Their abundance relative to marine-derived crenarchaeol, quantified in the BIT index, generally decreases offshore. However, in distal marine sediments, low relative amounts of brGDGTs can often still be observed. Sedimentary in situ production as well as dust input have been suggested as potential, though as yet not well constrained, sources. In this study brGDGT distributions in dust were examined and compared with those in distal marine sediments. Dust was sampled along the equatorial West African coast and brGDGTs were detected in most of the samples, albeit in low abundance. Their degree of methylation and cyclisation, expressed in the MBT' (methylation index of branched tetraethers) and DC (degree of cyclisation) indices, respectively, were comparable with those for African soils, their presumed source. Comparison of DC index values for brGDGTs in global soils, Congo deep-sea river fan sediments and dust with those of distal marine sediments clearly showed, however, that distal marine sediments had significantly higher values. This distinctive distribution is suggestive of sedimentary in situ production as a source of brGDGTs in marine sediments, rather than dust input. The presence of in situ produced brGDGTs in marine sediments means that caution should be exercised when applying the MBT'–CBT palaeothermometer to sediments with low BIT index values, i.e. < 0.1, based on our dataset.

Evaluation of long chain 1,14-alkyl diols in marine sediments as indicators for upwelling and temperature

Author: Kim Jung-Hyun
Mollenhauer Gesine
Rampen Sebastiaan W
Rodrigo-Gámiz Marta
Schefuß E.
Schouten Stefan
Sinninghe Damsté J. S.
Uliana Eleonora
Willmott Veronica
Publication venue
Publication date: 01/01/2014

Long chain alkyl diols form a group of lipids occurring widely in marine environments. Recent studies have suggested several palaeoclimatological applications for proxies based on their distributions, but have also revealed uncertainty about their applicability. Here we evaluate the use of long chain 1,14-alkyl diol indices for reconstruction of temperature and upwelling conditions by comparing index values, obtained from a comprehensive set of marine surface sediments, with environmental factors such as sea surface temperature (SST), salinity and nutrient concentration. Previous studies of cultures indicated a strong effect of temperature on the degree of saturation and the chain length distribution of long chain 1,14-alkyl diols in Proboscia spp., quantified as the diol saturation index (DSI) and diol chain length index (DCI), respectively. However, values of these indices for surface sediments showed no relationship with annual mean SST of the overlying water. It remains unknown what determines the DSI, although our data suggest that it may be affected by diagenesis, while the relationship between temperature and DCI may be different for different Proboscia species. In addition, contributions from algae other than Proboscia diatoms may affect both indices, although our data provide no direct evidence for additional long chain 1,14-alkyl diol sources. Two other indices using the abundance of 1,14-diols vs. 1,13-diols and C30 1,15-diols have been applied previously as indicators for upwelling intensity at different locations. The geographical distribution of their values supports the use of the 1,14-diols vs. 1,13-diols index, [C28 + C30 1,14-diols]/[(C28 + C30 1,13-diols) + (C28 + C30 1,14-diols)], as a general indicator for high nutrient or upwelling conditions.
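
The 1,14-diols vs. 1,13-diols index quoted at the end of this abstract is a simple ratio, sketched below as a small function. The function name, the argument names, and the choice to raise an error for a non-positive total are assumptions made for illustration; the abundances can be in any unit as long as all four share it.

```python
def diol_upwelling_index(c28_1_14, c30_1_14, c28_1_13, c30_1_13):
    """[C28 + C30 1,14-diols] / [(C28 + C30 1,13-diols) + (C28 + C30 1,14-diols)]."""
    diols_1_14 = c28_1_14 + c30_1_14
    diols_1_13 = c28_1_13 + c30_1_13
    total = diols_1_13 + diols_1_14
    if total <= 0:
        # Assumed behaviour: the index is undefined when no diols are measured.
        raise ValueError("total diol abundance must be positive")
    return diols_1_14 / total
```

The result lies between 0 and 1, with higher values meaning a larger share of 1,14-diols, which the abstract associates with high nutrient or upwelling conditions.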

Electronic Publication Information Center

Publishing Network for Geoscientific and Environmental Data

Utrecht University Repository

Active control of focal length and beam deflection in a metallic nano-slit array lens with multiple sources

Author: A. E. Çetin
K. Güven
Ö. E. Müstecaplıoğlu
Publication venue: 'The Optical Society'
Publication date: 21/10/2009

We propose a surface plasmon-polariton based nano-rod array lens structure that incorporates two additional lateral input channels, with the ability to actively control the focal length and the deflection of the transmitted beam through the lens via the intensity of the channel sources. We demonstrate by numerical simulations that applying the sources with the same intensity can change the focal point and the beam waist, whereas unequal intensities generate an asymmetric field profile in the nano-rod array, inducing an off-axis beam deflection.

arXiv.org e-Print Archive

Crossref

Using linguistic graph similarity to search for sentences in news articles

Author: Frasincar F. (Flavius)
Schouten K.I.M. (Kim)
Publication venue: 'IOS Press'
Publication date: 01/01/2016

With the volume of daily news growing too large for any individual to handle, there is a clear need for effective search algorithms. Because traditional bag-of-words approaches are inherently limited, as they ignore much of the information that is embedded in the structure of the text, we propose a linguistic approach to search called Destiny in this paper. With Destiny, sentences, both from news items and the user queries, are represented as graphs where the nodes represent the words in the sentence and the edges represent the grammatical relations between the words. The proposed algorithm is evaluated against a TF-IDF baseline using a custom corpus of user-rated sentences. Destiny significantly outperforms TF-IDF in terms of Mean Average Precision, normalized Discounted Cumulative Gain, and Spearman's Rho.

EUR Research Repository

Erasmus University Digital Repository